Two-stage Incremental Working Set Selection for Support Vector Training
Authors
Abstract
We introduce iSVM, an incremental algorithm that achieves high speed in training support vector machines (SVMs) on large datasets. Within the common decomposition framework, iSVM starts with a minimum working set (WS) and then iteratively selects one training example to update the WS in each optimization loop. iSVM employs a two-stage strategy for processing the training data. In the first stage, the most prominent vector among randomly selected data is added to the WS. This stage results in an approximate SVM solution. The second stage uses the intermediate solution to scan through the whole training data once more and find the remaining support vectors (SVs). We show that iSVM is especially efficient for training SVMs in applications where the data size is much larger than the number of SVs. On the KDD-CUP 1999 dataset, with nearly five million training examples, iSVM takes less than one hour to train an SVM with 94% testing accuracy, compared to seven hours with LibSVM, one of the state-of-the-art SVM implementations. We also provide analysis and experimental comparisons between iSVM and related algorithms.
Keywords: Support Vector Machine, Optimization, Decomposition Method, Sequential Minimal Optimization
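To make the two-stage strategy concrete, the sketch below shows one way the selection loop could look. It is a minimal illustration only: the "prominence" score (here, the margin of violation under the current model), the random sample size, and the use of a generic SVC as the inner solver are assumptions for illustration, since the abstract does not specify the paper's actual criterion or subproblem solver.

```python
# Minimal sketch of a two-stage incremental working-set scheme (assumptions noted above).
# Assumes binary labels y in {-1, +1}.
import numpy as np
from sklearn.svm import SVC

def decision_margin(model, X, y):
    # y_i * f(x_i): values below 1 indicate examples that violate the soft margin.
    return y * model.decision_function(X)

def isvm_sketch(X, y, n_rounds=200, sample_size=64, C=1.0, gamma=0.1, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    # Seed the working set with one example per class.
    ws = [int(np.where(y == c)[0][0]) for c in np.unique(y)]
    model = SVC(C=C, kernel="rbf", gamma=gamma)

    # Stage 1: grow the working set from random probes (approximate solution).
    for _ in range(n_rounds):
        model.fit(X[ws], y[ws])                       # retrain on the small working set
        probe = rng.choice(n, size=sample_size, replace=False)
        scores = decision_margin(model, X[probe], y[probe])
        if scores.min() >= 1.0:                       # no violator in this random sample
            continue
        worst = int(probe[np.argmin(scores)])         # most "prominent" violator
        if worst not in ws:
            ws.append(worst)

    # Stage 2: one full pass over the data to pick up remaining support vectors.
    model.fit(X[ws], y[ws])
    scores = decision_margin(model, X, y)
    stragglers = np.where(scores < 1.0)[0]
    ws = sorted(set(ws).union(int(i) for i in stragglers))
    model.fit(X[ws], y[ws])
    return model, ws
```

The intended payoff, as the abstract argues, is that when the number of SVs is small relative to the dataset, the working set stays small, so each retraining step in stage 1 is cheap and the full data is touched only once in stage 2.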
Similar Articles
A New Incremental Support Vector Machine Algorithm
The support vector machine is a popular method in machine learning. An incremental support vector machine algorithm is an ideal choice when facing large training data sets. In this paper, a new incremental support vector machine learning algorithm is proposed to improve the efficiency of large-scale data processing. The model of this incremental learning algorithm is similar to the standard support vecto...
Working Set Selection Using Second Order Information for Training Support Vector Machines
Working set selection is an important step in decomposition methods for training support vector machines (SVMs). This paper develops a new technique for working set selection in SMO-type decomposition methods. It uses second order information to achieve fast convergence. Theoretical properties such as linear convergence are established. Experiments demonstrate that the proposed method is faster...
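For context, the second-order rule referenced here selects the pair of working indices that maximizes the estimated decrease of the dual objective. The sketch below is a simplified rendering of that selection step, assuming binary labels in {-1, +1} and a dense precomputed kernel matrix; the variable names and the absence of kernel caching and shrinking are simplifications, not the paper's implementation.

```python
# Simplified second-order working-pair selection for SMO-type decomposition.
import numpy as np

def select_working_pair(alpha, G, y, K, C, tau=1e-12):
    """Return indices (i, j) of the selected working pair, or (-1, -1) if no
    violating pair exists. G is the gradient of the dual objective, y in
    {-1, +1}, and K the kernel matrix."""
    I_up = np.where(((y == 1) & (alpha < C)) | ((y == -1) & (alpha > 0)))[0]
    I_low = np.where(((y == 1) & (alpha > 0)) | ((y == -1) & (alpha < C)))[0]
    if len(I_up) == 0 or len(I_low) == 0:
        return -1, -1

    # First index: the maximal violator in I_up.
    i = I_up[np.argmax(-y[I_up] * G[I_up])]
    m = -y[i] * G[i]

    # Second index: among violating candidates in I_low, minimize -b^2 / a,
    # the second-order estimate of the objective decrease for the pair (i, t).
    best_j, best_obj = -1, 0.0
    for t in I_low:
        b = m + y[t] * G[t]                      # b_it = -y_i G_i + y_t G_t
        if b <= 0:                               # t does not violate optimality with i
            continue
        a = K[i, i] + K[t, t] - 2.0 * y[i] * y[t] * K[i, t]
        a = a if a > 0 else tau                  # guard against non-positive curvature
        obj = -(b * b) / a
        if obj < best_obj:
            best_obj, best_j = obj, t
    return (int(i), int(best_j)) if best_j >= 0 else (-1, -1)
```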
A study on SMO-type decomposition methods for support vector machines
Decomposition methods are currently one of the major methods for training support vector machines. They vary mainly according to different working set selections. Existing implementations and analysis usually consider some specific selection rules. This paper studies sequential minimal optimization type decomposition methods under a general and flexible way of choosing the two-element working s...
Fast Training of Linear Programming Support Vector Machines Using Decomposition Techniques
Decomposition techniques are used to speed up the training of support vector machines, but for linear programming support vector machines (LP-SVMs) a direct implementation of decomposition techniques leads to infinite loops. To solve this problem and to further speed up training, in this paper we propose an improved decomposition technique for training LP-SVMs. If an infinite loop is detected, we inclu...
Incremental support vector machine algorithm based on multi-kernel learning
A new incremental support vector machine (SVM) algorithm is proposed that is based on multiple kernel learning. By introducing multiple kernel learning into SVM incremental learning, the problem of learning from large-scale data sets can be solved effectively. Furthermore, different penalties are applied to the training subset and the acquired support vectors, which may help to impro...